Experimental Evaluation of Multi-Round Matrix Multiplication on MapReduce
Abstract
This paper proposes a Hadoop library, named M3, for performing dense and sparse matrix multiplication in MapReduce. The library features multi-round MapReduce algorithms that allow the number of rounds to be traded off against the amount of data shuffled in each round and the amount of memory required by the reduce functions. We claim that, in cloud settings, multi-round MapReduce algorithms are preferable to traditional monolithic algorithms, that is, algorithms requiring just one or two rounds. We perform an extensive experimental evaluation of the M3 library on an in-house cluster and on a cloud provider, with the aim of assessing the performance of the library and comparing the multi-round and monolithic approaches.
Keywords: MapReduce, Hadoop, multi-round algorithms, matrix multiplication, experiments, cloud
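The trade-off between the number of rounds and the per-round shuffle size can be illustrated with a small single-process simulation. The sketch below is not the M3 library's algorithm or API; it is a toy blocked product in plain Python, and the names `multi_round_matmul`, `split_blocks`, `bs` (block size) and `rounds` are hypothetical. It assumes square matrices whose size is a multiple of `bs`, counts shuffled data in blocks rather than bytes, and ignores the cost of carrying partial sums from one round to the next.

```python
from collections import defaultdict


def matmul_block(a, b):
    """Plain dense product of two small blocks given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add_block(a, b):
    """Element-wise sum of two blocks of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def split_blocks(m, bs):
    """Split a square matrix into a dict {(i, j): block} of bs x bs blocks."""
    q = len(m) // bs
    return {
        (i, j): [row[j * bs:(j + 1) * bs] for row in m[i * bs:(i + 1) * bs]]
        for i in range(q)
        for j in range(q)
    }


def multi_round_matmul(a, b, bs, rounds):
    """Simulate a multi-round blocked product (toy model, hypothetical names).

    Returns the product matrix and a list with the number of input blocks
    shuffled in each round.
    """
    n = len(a)
    q = n // bs
    assert n % bs == 0 and q % rounds == 0
    A, B = split_blocks(a, bs), split_blocks(b, bs)
    zero = [[0] * bs for _ in range(bs)]
    C = {(i, j): zero for i in range(q) for j in range(q)}
    per_round = q // rounds
    shuffled = []
    for r in range(rounds):
        # Map: for each inner index h handled in this round, send the pair
        # (A[i,h], B[h,j]) to the reducer responsible for output block (i, j).
        groups = defaultdict(list)
        for h in range(r * per_round, (r + 1) * per_round):
            for i in range(q):
                for j in range(q):
                    groups[(i, j)].append((A[(i, h)], B[(h, j)]))
        shuffled.append(sum(2 * len(pairs) for pairs in groups.values()))
        # Reduce: multiply the received pairs and add them to the running sum.
        for key, pairs in groups.items():
            for x, y in pairs:
                C[key] = add_block(C[key], matmul_block(x, y))
    # Reassemble the block dictionary into a full matrix.
    product = [
        [C[(i // bs, j // bs)][i % bs][j % bs] for j in range(n)]
        for i in range(n)
    ]
    return product, shuffled
```

In this toy model, running with `rounds=1` sends every block pair in a single shuffle, while a larger `rounds` value sends the same total number of blocks in smaller batches, so each reducer also holds fewer pairs at a time.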
Related papers
Co-processing SPMD Computation on GPUs and CPUs on Shared Memory System
Heterogeneous parallel systems with multiple processors and accelerators are becoming ubiquitous due to better cost-performance and energy efficiency. These heterogeneous processor architectures have different instruction sets and are optimized for either task latency or throughput. Challenges arise with regard to programmability and performance when executing SPMD computations on heterogene...
Upper and Lower Bounds on the Cost of a Map-Reduce Computation
In this paper we study the tradeoff between parallelism and communication cost in a map-reduce computation. For any problem that is not “embarrassingly parallel,” the finer we partition the work of the reducers so that more parallelism can be extracted, the greater will be the total communication between mappers and reducers. We introduce a model of problems that can be solved in a single round...
Benchmark Hadoop and Mars: MapReduce on cluster versus on GPU
MapReduce[5] is an emerging programming model that utilizes distributed processing elements (PE) on large datasets. With this model, programmers can write highly parallelized code without explicitly dealing with task scheduling and code parallelism in distributed systems. In this paper, we comparatively evaluate the performance of MapReduce model on Hadoop[2] and on Mars[3]. Hadoop is a softwar...
A New Parallel Matrix Multiplication Method Adapted on Fibonacci Hypercube Structure
The objective of this study was to develop a new optimal parallel algorithm for matrix multiplication that can run on a Fibonacci Hypercube structure. Most of the popular algorithms for parallel matrix multiplication cannot run on a Fibonacci Hypercube structure; a method that can run on all structures, especially the Fibonacci Hypercube structure, is therefore necessary for parallel matr...